Graph Based Classification Methods Using Inaccurate External Classifier Information
نویسندگان
چکیده
In this paper we consider the problem of collectively classifying entities where relational information is available across the entities. In practice inaccurate class distribution for each entity is often available from another (external) classifier. For example this distribution could come from a classifier built using content features or a simple dictionary. Given the relational and inaccurate external classifier information, we consider two graph based settings in which the problem of collective classification can be solved. In the first setting the class distribution is used to fix labels to a subset of nodes and the labels for the remaining nodes are obtained like in a transductive setting. In the other setting the class distributions of all nodes are used to define the fitting function part of a graph regularized objective function. We define a generalized objective function that handles both the settings. Methods like harmonic Gaussian field and local-global consistency (LGC) reported in the literature can be seen as special cases. We extend the LGC and weighted vote relational neighbor classification (WvRN) methods to support usage of external classifier information. We also propose an efficient least squares regularization (LSR) based method and relate it to information regularization methods. All the methods are evaluated on several benchmark and real world datasets. Considering together speed, robustness and accuracy, experimental results indicate that the LSR and WvRN-extension methods perform better than other methods.
منابع مشابه
Using Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...
متن کاملUsing Fuzzy LR Numbers in Bayesian Text Classifier for Classifying Persian Text Documents
Text Classification is an important research field in information retrieval and text mining. The main task in text classification is to assign text documents in predefined categories based on documents’ contents and labeled-training samples. Since word detection is a difficult and time consuming task in Persian language, Bayesian text classifier is an appropriate approach to deal with different...
متن کاملSupport Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran
Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...
متن کاملIdentification of mild cognitive impairment disease using brain functional connectivity and graph analysis in fMRI data
Background: Early diagnosis of patients in the early stages of Alzheimer's, known as mild cognitive impairment, is of great importance in the treatment of this disease. If a patient can be diagnosed at this stage, it is possible to treat or delay Alzheimer's disease. Resting-state functional magnetic resonance imaging (fMRI) is very common in the process of diagnosing Alzheimer's disease. In th...
متن کاملMalware Detection using Classification of Variable-Length Sequences
In this paper, a novel method based on the graph is proposed to classify the sequence of variable length as feature extraction. The proposed method overcomes the problems of the traditional graph with variable length of data, without fixing length of sequences, by determining the most frequent instructions and insertion the rest of instructions on the set of “other”, save speed and memory. Acco...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1206.5915 شماره
صفحات -
تاریخ انتشار 2009